IBM | Data Engineer Interview Experience | 2 YoE



Round 1: Technical

The interview started with me describing my project. The role was for a GCP Data Engineer, but my project was built on Azure, so I assured the interviewers that I had practiced on GCP and could adapt to it quickly.

The questions primarily focused on Spark, SQL, and Hive. Some of the questions I recall include:

📍 Basic Spark syntax, such as filtering, joining, and displaying the top N rows of a dataset.

📍 How to display output as a DataFrame when using SQL syntax in Spark.

📍 The difference between internal and external tables.

📍 We inserted data into a Hive table but couldn't see it when performing a SELECT * FROM table. How can we resolve this issue?

📍 How to find the employee with the second-highest salary, along with their name and department, using SQL.
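
For the second-highest salary question, one common approach is to rank salaries with DENSE_RANK() and keep rank 2, so that ties are all returned. Below is a minimal sketch, not the answer given in the interview. The interviewers did not share a schema, so the `employees` table, its columns, and the sample rows are made up for illustration. It also assumes "second-highest" means across the whole table rather than per department, and it runs against an in-memory SQLite database (3.25 or newer for window functions) instead of Hive or Spark.

```python
import sqlite3


def second_highest_salary(rows):
    """Return (name, department, salary) for every employee earning
    the second-highest distinct salary, sorted by name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the original question did not specify one.
    conn.execute(
        "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)

    query = """
        SELECT name, department, salary
        FROM (
            SELECT name, department, salary,
                   DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
            FROM employees
        ) AS ranked
        WHERE salary_rank = 2
        ORDER BY name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data with a tie at the second-highest salary.
sample_rows = [
    ("Asha", "Data", 90000),
    ("Ben", "Data", 120000),
    ("Chen", "Platform", 105000),
    ("Dina", "Platform", 105000),
    ("Eli", "Analytics", 80000),
]

if __name__ == "__main__":
    for name, department, salary in second_highest_salary(sample_rows):
        print(name, department, salary)
```

DENSE_RANK() is used instead of ROW_NUMBER() because two employees can share the same salary. If the interviewer wants the second-highest salary within each department, adding PARTITION BY department inside the OVER clause would be the usual change.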

Round 2: Managerial

The second round was managerial. It began with a description of my background, previous work experience, and project details.

📍 Why do we prefer to use Databricks over other applications?

📍 How was work assigned to me in my previous project?

After these two rounds, the HR team reached out to discuss salary. Once we agreed, I was told to expect the offer letter within a week.